perm filename FILES.PRO[S79,JMC] blob
sn#447455 filedate 1979-06-06 generic text, type C, neo UTF8
.s Describing Other People's Files
//pers John McCarthy
This part of the proposal is to develop a language for describing
existing files over whose format the describer has no control.
The increase in communication facilities between computers
produces many opportunities for organizations to use each other's files,
but few of the opportunities have been realized. For example, Lawrence
Berkeley Laboratory has a file of census data, but it is not a routine
operation for programs all over the country to use this data. Another
example is that almost every ARPAnet installation maintains a local file
of names and addresses, but it requires special knowledge for a user of
Stanford's computer to get at the address and telephone files at the
M.I.T. Artificial Intelligence Laboratory.
One approach to solving such problems is for everyone to agree to
use common formats for certain kinds of files, i.e. for one data base
scheme to "conquer the world". This isn't going to happen. Files are
developed at different times, at different technological levels, and with
different hardware. Co-ordination within a single installation is difficult
to achieve, and larger scale co-ordination is achieved only in connection
with the development of new systems.
This proposed work is based on accepting the idea that many installations
and many applications within installations will develop their own file formats.
To cope with this we propose to develop a widely applicable system for
formally describing existing file formats. Given such a description,
a general program can interpret it in order to extract information from
or even put information into other people's files.
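As a rough illustration of what "a general program can interpret it" might mean, the sketch below uses a small Python table as a stand-in for a format description. The table layout, the field names, the column ranges, and the functions `read_record` and `write_record` are all assumptions made for this example; they are not part of FiDL, which remains to be designed.

```python
# Hypothetical description of a fixed-width record format (not actual FiDL).
# Each field maps to a (start, end) column range within one line of the file.
DESCRIPTION = {
    "name": (0, 20),
    "phone": (20, 32),
}

def read_record(line, description):
    """Extract the described fields from one line of the file."""
    return {field: line[start:end].strip()
            for field, (start, end) in description.items()}

def write_record(values, description):
    """Build a line in the described format from a dict of field values."""
    width = max(end for _, end in description.values())
    chars = [" "] * width
    for field, (start, end) in description.items():
        # Truncate or pad the value so it exactly fills its column range.
        text = values[field][: end - start].ljust(end - start)
        chars[start:end] = text
    return "".join(chars)

# Invented sample values, used only to exercise both directions.
line = write_record({"name": "Example Person", "phone": "555-0100"}, DESCRIPTION)
record = read_record(line, DESCRIPTION)
```

The same table drives both reading and writing, which is the property the paragraph above asks for: the general program contains no knowledge of any particular file.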
Consider the address and telephone files at the various
laboratories. If we do our job well, each such file will be describable
in our File Description Language (FiDL). A program asked for
the telephone number of Patrick Winston at M.I.T. will then read the file
description, call M.I.T. on the ARPAnet, and extract the desired
information. Since most installations include additional
information in the file, much of which is not interesting to outsiders,
FiDL should provide for partial description of files.
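Partial description could look something like the following sketch, in which only two fields of a delimited record are named and the rest are passed over. The delimiter, the field positions, the sample records, and the `lookup` function are hypothetical; no real installation's file format is being described here.

```python
# Hypothetical partial description: only two of the file's fields are named.
# Fields are separated by "|"; positions 1 and 2 are left undescribed.
PARTIAL_DESCRIPTION = {
    "delimiter": "|",
    "fields": {"name": 0, "phone": 3},
}

# Invented sample records standing in for a remote installation's file.
SAMPLE_FILE = """\
Example Person|Room 101|local-only note|555-0100
Another Person|Room 202|local-only note|555-0199
"""

def lookup(file_text, description, key_field, key_value, wanted_field):
    """Return wanted_field from the first record whose key_field matches."""
    fields = description["fields"]
    for line in file_text.splitlines():
        parts = line.split(description["delimiter"])
        if parts[fields[key_field]] == key_value:
            return parts[fields[wanted_field]]
    return None  # no matching record

phone = lookup(SAMPLE_FILE, PARTIAL_DESCRIPTION, "name", "Another Person", "phone")
```

Because the undescribed fields are never interpreted, the owner of the file can change them without invalidating the description.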
The Lawrence Berkeley Laboratory census files pose
an interesting and different problem. We want to describe them in such a way
that a user anywhere on the ARPAnet can ask a local program for the
1970 population of Pittsburgh and get an answer from a program that
knows about the LBL file. The user himself need not know where the
information is coming from. However, the LBL census file is intended
to be interrogated interactively by a human and not by a computer. Therefore,
FiDL must provide for the formal description of the interaction method
so that the program using it can pretend to be a human and engage in a
suitable dialog with the LBL program.
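One way to picture a formal description of an interaction method is as an ordered script of expected prompts and replies, as in the sketch below. Everything in it is assumed for illustration: the prompts, the `DIALOG` table, the simulated program, and the population figure are invented and do not reflect the actual LBL system or its data.

```python
# Hypothetical dialog description: an ordered list of (expected prompt, reply).
# "{year}" and "{city}" are placeholders filled in from the user's question.
DIALOG = [
    ("Which year?", "{year}"),
    ("Which city?", "{city}"),
]

def simulated_census_program():
    """Stand-in for an interactive program; the figure below is invented."""
    year = yield "Which year?"
    city = yield "Which city?"
    yield f"Population of {city} in {year}: 12345"

def run_dialog(program, dialog, **question):
    """Play the human's side of the dialog and return the program's final line."""
    prompt = next(program)
    for expected, reply in dialog:
        if prompt != expected:
            raise RuntimeError(f"unexpected prompt: {prompt!r}")
        prompt = program.send(reply.format(**question))
    return prompt

answer = run_dialog(simulated_census_program(), DIALOG,
                    city="Pittsburgh", year="1970")
```

In a real setting the generator would be replaced by a network connection to the remote program, and the script would need to tolerate variations in the prompts; the sketch only shows the description driving the conversation.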
We propose to devote one research associate to this problem
for the two-year period of the contract. He may be assisted by graduate
students of suitable talents and interests.
The work will result in a program
that can be used from anywhere in the ARPAnet and will extract information
from files on the ARPAnet for which descriptions have been made. We plan
to use the address and telephone files as our first experimental domain.
Costs are detailed in the budget.